ABSTRACT

The chromothripsis hypothesis suggests an extraor- 
dinary one-step catastrophic genomic event allow- 
ing a chromosome to 'shatter into many pieces' and 
reassemble into a functioning chromosome. Recent 
efforts have aimed to detect chromothripsis by look- 
ing for a genomic signature, characterized by a large 
number of breakpoints (50-250), but a limited number 
of oscillating copy number states (2-3) confined to a 
few chromosomes. The chromothripsis phenomenon 
has become widely reported in different cancers, 
but using inconsistent and sometimes relaxed criteria
for determining that rearrangements occur simultaneously
rather than progressively. We revisit the original
simulation approach and show that the signature
is not clearly exceptional, and can be explained us- 
ing only progressive rearrangements. For example, 
3.9% of progressively simulated chromosomes with 
50-55 breakpoints were dominated by two or three 
copy number states. In addition, by adjusting the pa- 
rameters of the simulation, the proposed footprint 
appears more frequently. Lastly, we provide an algo- 
rithm to find a sequence of progressive rearrange- 
ments that explains all observed breakpoints from 
a proposed chromothripsis chromosome. Thus, the 
proposed signature cannot be considered a suffi- 
cient proof for this extraordinary hypothesis. Great 
caution should be exercised when labeling complex 
rearrangements as chromothripsis from genome hy- 
bridization and sequencing experiments. 

INTRODUCTION 

In a groundbreaking 2011 study (1), Stephens et al. ob- 
served a pattern of structural variation in a leukemia 
genome so atypical it presumptively revealed a novel mech- 
anism of chromosome rearrangement. Two features dis- 
tinguish this variation pattern. First, the chromosome or 
chromosomal region in question has many clustered breakpoints
that suggest complex adjacencies rather than simple
deletions or non-overlapping tandem duplications. Second,
the region oscillates between two or perhaps three copy
number states. 

To further investigate this phenomenon, Stephens et al. 
sequenced several cell lines with chromosomes that exhib- 
ited these features. One of these chromosomes was chro- 
mosome 15 from SNU-C1, a colon cancer cell line. This 
chromosome has 239 breakpoints identified by paired-end 
sequencing (PES) and mostly oscillates between two copy 
number states, two and four. Using simulations, Stephens 
et al. showed that the progressive introduction of the break- 
points they observed would result in a chromosome with 
many copy number states rather than just two. They hy- 
pothesized that the peculiar rearrangement pattern was not 
the result of progressive rearrangements but instead the re- 
sult of the chromosome shattering followed by the random 
stitching together of the resulting pieces. They termed this 
phenomenon 'chromothripsis'. 

To determine how widespread chromothripsis may be, 
Stephens et al. used the progressive rearrangement simu- 
lation from SNU-C1 to conclude that a chromosome with 
at least 50 breakpoints dominated by at most three copy 
number states was unlikely to have been rearranged progres- 
sively and thus was likely to be a product of chromothripsis. 
Using these criteria, they searched copy number profiles and
estimated that 2-3% of cancers have a chromosome that bears the
hallmark of chromothripsis.

This is a striking result; it suggests a mechanism of can- 
cer genome evolution that contrasts starkly with previ- 
ously described models. This discovery has generated excite- 
ment and ongoing investigation. Subsequent studies have 
found evidence for chromothripsis in multiple myeloma (2), 
medulloblastoma (3,4), neuroblastoma (5) and colorectal 
cancers (6) as well as the germline (7,8). Moreover, in some
studies, chromothripsis has been associated with more ag- 
gressive cancers. Thus, it would appear that a new source 
of human disease has been found, with potentially far- 
reaching effects on our understanding and treatment of can- 
cer (9). 

The great potential of chromothripsis cannot be realized 
unless it can be accurately detected. It is unlikely that chro- 
mothripsis will ever be reliably observed directly, so we will 
need to rely on the footprint that chromothripsis should 
leave in copy number and breakpoint data. The characterization
of this footprint is an open problem. While Stephens
et al. searched for chromosomes dominated by at most three 
copy number states with at least 50 positions where copy 
number changes, subsequent works have used more relaxed 
criteria. They have required fewer breakpoints per chromo- 
some, such as 20 (5), 10 (3,4) or just a handful (8). They also 
have not always required that the number of unique copy 
states in a chromosome be limited to two or three (4,5,10). 

The validity of these footprints of chromothripsis rests 
on the idea that progressive rearrangement cannot create 
such patterns. However, the evidence for this proposition is 
largely limited to the initial simulation work by Stephens et al.
Chromothripsis is now being investigated in different con- 
texts than Stephens' cell line simulations. Furthermore, the 
diversity of approaches used to identify chromothripsis 
means some groups are likely over- or underestimating its 
prevalence. This, together with the potentially great signif- 
icance of chromothripsis, highlights the value of revisiting 
and extending the simulation work that underlies current 
strategies for identifying chromothripsis. 

In this article, we review the simulation approach that 
suggests that progressive rearrangements cannot yield a 
chromosome with many breakpoints and few unique copy 
number states. First, we explore whether changes to the 
implementation of the simulation affect the validity of the 
footprint of chromothripsis. We show that a subtle but con- 
sequential error in the original implementation of the simu- 
lation causes it to understate the breakpoint and copy num- 
ber patterns that can be achieved by progressive rearrange- 
ment. We examine varying possible meanings of 'break- 
point' and 'copy number state' and determine definitions 
that more closely correspond to experimental results. Next, 
we show that progressive rearrangement with a preference 
for inversions can produce chromosomes that bear the puta- 
tive footprint of chromothripsis. Together these issues sug- 
gest that, assuming the simulation approach is valid, more 
stringent criteria must be used to identify chromothripsis 
and that the current literature overstates its prevalence. 

We then demonstrate that the simulation approach pro- 
duces similar results whether a chromosome is progressively 
rearranged or not. This undermines its ability to distinguish
between chromothripsis and progressive rearrangement.
Building on this finding, we demonstrate a method
that finds plausible progressive rearrangements that explain 
the breakpoints of particular chromosomes that have been 
documented to have undergone chromothripsis. Finally, we 
discuss the significance of these findings and question the 
chromothripsis hypothesis. 

MATERIALS AND METHODS 

Simulating progressive rearrangements 

We will first summarize the simulation method. Consider 
a chromosome 100 bases long that undergoes chromothrip- 
sis, shattering into 10 segments of 10 bases, which we label a 
through j. The segments come back together, but some are 
lost, some are inverted and the order is shuffled. Suppose 
the resulting arrangement of segments is ae(-c)(-g)(-h)j. If 
this chromosome is sequenced, it will reveal five breakpoints 
and copy numbers that alternate between zero and one. The
breakpoints are shown in Table 1, and an illustration of
breakpoint positions is shown in Figure 1a.

We can now step through the progressive rearrangement 
simulation used by Stephens et al. The simulated chromo- 
some begins intact, with no rearrangements (Figure 2a). 
Then, a random breakpoint is chosen from the set of ob- 
served breakpoints. In this case, suppose the breakpoint be- 
tween 60 and 80 is chosen. This breakpoint is now intro- 
duced into the chromosome via one of three rearrangement 
types: inversion, deletion or tandem duplication. The ob- 
served orientation of the two ends of the breakpoint is head- 
to-tail. So, an inversion cannot be used to create the break- 
point because that will result in segments with orientations 
of head-to-head or tail-to-tail. A deletion between 60 and 80 
will not work because then the orientations would be tail-to- 
head. But, a tandem duplication between 60 and 80 will re- 
sult in a breakpoint that, when read from 60 to 80, will join 
the head of segment g to the tail of segment h. So, segments 
g and h are duplicated. Suppose that the breakpoint between
20 and 70 is then chosen, and that segments c through the furthest segment
g are duplicated. Next, the rearrangement between 30 and
50 is chosen. Using similar reasoning as above, a rearrange- 
ment that reproduces the breakpoint is introduced. Suppose 
the rearrangement chosen is an inversion of segments fghgc. 
Note that this creates two breakpoints, the observed break- 
point plus another one in head-to-head orientation. Then, 
two more rearrangements are introduced resulting in the 
chromosome in Figure 2f. The number of breakpoints and 
copy number states in this chromosome would be recorded, 
and the simulation would be repeated many times with dif- 
ferent rearrangement orders and segment choices. It would 
also be stopped when specific numbers of breakpoints had 
been introduced so that the relationship between the num- 
ber of breakpoints and the number of unique copy number 
states could be determined. 
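
The three rearrangement types used in the simulation can be sketched on a toy chromosome held as a list of signed segment labels. This is a minimal illustration under assumptions made here, not the code of Stephens et al.: the function names, the list representation and the particular rearrangements applied are all hypothetical.

```python
from collections import Counter

# A chromosome is a list of (label, sign) pairs; sign -1 marks an inverted segment.
def intact(labels):
    return [(x, +1) for x in labels]

def invert(chrom, i, j):
    """Reverse the order and flip the orientation of chrom[i:j]."""
    flipped = [(x, -s) for x, s in reversed(chrom[i:j])]
    return chrom[:i] + flipped + chrom[j:]

def delete(chrom, i, j):
    """Remove chrom[i:j]."""
    return chrom[:i] + chrom[j:]

def tandem_duplicate(chrom, i, j):
    """Insert a second copy of chrom[i:j] immediately after the first."""
    return chrom[:j] + chrom[i:j] + chrom[j:]

def copy_numbers(chrom, labels):
    """Copy number of every reference segment; lost segments count as 0."""
    counts = Counter(x for x, _ in chrom)
    return [counts[x] for x in labels]

labels = list("abcdefghij")
c = intact(labels)
c = tandem_duplicate(c, 6, 8)   # duplicate g and h
c = invert(c, 2, 5)             # invert c, d, e
c = delete(c, 1, 2)             # lose b
cn = copy_numbers(c, labels)
print(cn, "distinct states:", len(set(cn)))
```

Each operation returns a new list, so intermediate chromosomes can be kept and inspected after every step, which is how the relationship between breakpoint count and copy number states would be recorded.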

Finding chromosome rearrangements consistent with ob- 
served breakpoints 

In addition to simulations, we can propose sequences of 
rearrangements that explain all the observed breakpoints. 
Each breakpoint is provided as a pair of coordinates that is 
non-adjacent in the reference genome, but adjacent in the 
donor sample. We first construct a breakpoint graph (11)
from breakpoints (e.g. the graph in Figure 1b is constructed from the
breakpoints in Table 1): Partition the chromosome of length
$L$ into $n$ segments based on breakpoint coordinates $p_1, p_2, \ldots$
Each segment has two ends labeled by their position on
the chromosome, with the 5' segment end marked as head ($h$)
and the 3' segment end marked as tail ($t$). The breakpoint
graph is described by $2n$ nodes, denoted
$\{(p_0, h), (p_1, t), (p_1, h), (p_2, t), \ldots, (p_{2n-1}, t), (p_{2n-1}, h), (p_{2n}, t)\}$,
where $p_0 = 1$ and $p_{2n} = L$. Pairs of nodes are connected by segment-edges
and breakpoint-edges. Segment-edges (shown in grey in Figure 1b)
connect $(p_{i-1}, h)$ to $(p_i, t)$ for all $1 \le i \le 2n$. Thus,
every node has exactly one segment edge. Recall that the input
is a set of breakpoints, $((p_i, e_i), (p_j, e_j))$, where $p_i$, $p_j$
correspond to the chromosomal positions that are brought
together, and $e_i$, $e_j$ are each either head or tail. By definition,
the breakpoint graph already has the nodes $(p_i, e_i)$ and $(p_j, e_j)$,
and we connect each such pair with a breakpoint-edge

Figure 1. A hypothetical chromosome of length 100 shattered into blocks of length 10 and then reassembled. The breakpoints of the hypothetical rearranged chromosome are in Table 1. (a) The breakpoints and block copy number projected onto the original chromosome. (b) A breakpoint graph representing the hypothetical rearranged chromosome with the Table 1 breakpoints.

Figure 2. Following the simulation procedure of Stephens et al., a sequence of possible rearrangement steps that explains the observed breakpoints in Table 1.

Table 1. Breakpoints of the rearranged chromosome in Figure 1

(shown in green). When constructing the graph from the
Stephens et al. data, no node had more than one breakpoint 
edge. The observed loss of heterozygosity in chromosomes 
suggested that a single chromosome undergoes rearrange- 
ment. Our method uses the haploid assumption, by finding 
a single path in the breakpoint graph. Alternatively, find- 
ing two disjoint paths in the breakpoint graph corresponds 
to two rearranged chromosomes, without changing the ob- 
served breakpoints. 

Based on sorting by reversals theory (11), a continuous
path from the start node to the end node reads out a se- 
quence of alternating segment and breakpoint edges repre- 
senting the rearranged chromosome. Given the continuous 
path, there is always a sequence of reversals (inversions) that 
transforms the rearranged chromosome back into the original
segment ordering. In Figure 1b, the only graph component
of this type is the path from (1,h) to (100,t). If the
set of observed breakpoints is not complete, we will have 
multiple connected components. This is remedied by chain- 
ing paths together with new breakpoint edges. The termini 
of paths with breakpoint edges are nodes with a single out- 
going segment edge. To create a full path between chromo- 
some start and end nodes, begin with a path containing the 
start node and connect the non-start terminus to the terminus
of a randomly selected path by adding a new breakpoint
edge. Continue joining paths until all
paths with observed breakpoint edges are consumed,
finishing with the path containing the end node. With the
continuous path as input, GRIMM (12) finds a sequence 
of inversions transforming the original segment order into 
the rearranged chromosome. Note that once the breakpoint 
graph is created, all rearrangements impact only the break- 
point junctions between segments, and breakpoint reuse is 
allowed (13). 

Deleted segments are found in isolated connected components
comprising exactly two nodes and one segment
edge. In Figure 1b, there are four deleted segments. The last
possibility for connected components in the graph is a cycle,
which represents duplicated segments. For example, a
simple cycle with a breakpoint edge connecting the head of 
a segment to its tail is a tandem duplication. To determine 
the order of rearrangements to create the observed chromo- 
some, deletions are randomly placed in the inversion order 
and duplications are introduced last to preserve low copy 
number states. 
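
The graph construction can be illustrated on the toy arrangement ae(-c)(-g)(-h)j from the summary of the simulation method. The sketch below is an assumption-laden simplification: nodes are written as (segment label, end) rather than (position, end), the five breakpoint edges are derived by hand from that arrangement because Table 1 itself is not reproduced here, and the helper names are hypothetical. It walks the single path of alternating segment and breakpoint edges and reports segments that lie in isolated components.

```python
# Toy breakpoint graph for the arrangement a e (-c) (-g) (-h) j.
labels = list("abcdefghij")

# Breakpoint edges join segment ends: 'h' = head (5' end), 't' = tail (3' end).
breakpoints = [
    (("a", "t"), ("e", "h")),
    (("e", "t"), ("c", "t")),
    (("c", "h"), ("g", "t")),
    (("g", "h"), ("h", "t")),
    (("h", "h"), ("j", "h")),
]

def build_breakpoint_edges(breakpoints):
    """Map each node to its breakpoint-edge partner (at most one per node)."""
    partner = {}
    for u, v in breakpoints:
        assert u not in partner and v not in partner, "node has two breakpoint edges"
        partner[u], partner[v] = v, u
    return partner

def other_end(node):
    """Follow the segment edge: head <-> tail of the same segment."""
    label, end = node
    return (label, "t" if end == "h" else "h")

def walk(start, partner):
    """Alternate segment and breakpoint edges from `start`; return signed segments."""
    path, node = [], start
    while True:
        label, end = node
        # Entering a segment at its head reads it forward, at its tail inverted.
        path.append(label if end == "h" else "-" + label)
        exit_node = other_end(node)
        if exit_node not in partner:
            return path, exit_node
        node = partner[exit_node]

partner = build_breakpoint_edges(breakpoints)
path, terminus = walk(("a", "h"), partner)
visited = {p.lstrip("-") for p in path}
deleted = [x for x in labels
           if x not in visited
           and (x, "h") not in partner and (x, "t") not in partner]
print(path, terminus, deleted)
```

In this toy case the walk recovers the rearranged chromosome and the four isolated segments b, d, f and i, in line with the four deleted segments noted for Figure 1b. Cycles and the chaining of multiple paths are not handled in this sketch.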

RESULTS 

Simulating progressive rearrangements 

Stephens et al. graciously shared the code they used to 
produce their results. We have reimplemented the method,
applied it to chromosome 15 of SNU-C1 and replicated
their results (Figure 3a). The general trend, consistent with 
Stephens' result, is that the number of unique copy num- 
ber states increases with the number of breakpoints. A chro- 
mosome with 239 breakpoints and only two copy number 
states falls well outside of what was produced by the pro- 
gressive simulation, and this is a key piece of evidence that 
chromosome 15 of SNU-C1 is the result of chromothripsis 
rather than progressive rearrangement. Moreover, based on 
the chart, it appears that a chromosome with at most three 
copy number states and more than 50, or perhaps even 20, 
breakpoints also falls outside of what can be achieved by 
progressive rearrangement. 

Chromothripsis footprint criteria depend on subtle simulation 
implementation details 

The above result is more meaningful if it is robust to changes 
in the implementation of the simulation. In this section, we 
alter the simulation in various ways to determine if the pro- 
posed footprint of chromothripsis remains valid when as- 
sumptions about progressive rearrangement are changed. 

The first change we made to the simulation was a correc- 
tion of a logic error that caused some simulated inversions 
to behave like duplications. The details are in the Supple- 
mentary data, but the net effect was that some operations 
that ought to have preserved existing copy numbers instead 
introduced up to two new copy number states to the chro- 
mosome. When we corrected this, the chart of copy num- 
ber states and breakpoints shifted down (Figure 3b). This 
change in result does not affect inferences about chromo- 
some 15 of SNU-C1, since 239 breakpoints and only two 
copy number states are still well outside of the simulated 
results. But, the simulated chromosomes now begin to en- 
croach upon the chromothripsis region of the graph. For 
example, the new simulation produced a chromosome with 
67 breakpoints and only 3 copy number states, which is con- 
sistent with the footprint of chromothripsis even though the 
chromosome was rearranged progressively. 

The next alteration was to the counting of breakpoints 
and copy number states. Thus far, we have been imprecise 
about the meaning of the breakpoint values on the x-axes 
of our charts. This imprecision is also found in the liter- 
ature, but there are in fact multiple ways to count break- 
points on a chromosome. One way is to count the number 
of times an abnormal adjacency appears. For example, in the
chromosome in Figure 2f, moving from left to right, we find
eight such adjacencies: a(e), e(-c), (-c)(-g), (-g)(-h), (-f)d, hg, 
g(-i) and (-h)j. This counting method was used in Figure 
3a and b.

Figure 3. Charts of number of breakpoints versus number of copy number states for simulated chromosomes. The shaded gray area indicates the boundaries of the footprint of chromothripsis proposed by Stephens et al. The cell lines with number of breakpoints and copy number states as described by Stephens et al. are plotted as red points. The red dashed line shows the median number of copy number states for given numbers of breakpoints. The green dashed lines show an interval of copy number states that contains 99% of observations. (a) Results of directly reimplementing the simulation method of Stephens et al. (b) Results after fixing the indexing issue for inversions. (c) Results counting breakpoints as they would appear from PES and counting the number of copy number states needed to cover 95% of the chromosome. (d) Results counting breakpoints as they would appear from microarrays or depth of coverage and counting the number of copy number states needed to cover 95% of the chromosome.

Another way to count breakpoints is to consider
how the breakpoints would be reported by a PES experiment (14).
This is similar to the previous method, except
that if an abnormal adjacency appears in the chromosome 
multiple times because of duplications, it will only appear 
once in the sequencing results. So referring back to Figure 
2f, the adjacencies hg and (-g)(-h) would count as one adja- 
cency, even though they appear twice on the chromosome. 
A third way to count breakpoints is to consider how they 
will appear in a microarray or depth of coverage experiment 
(15). This method counts breakpoints where copy number 
changes. The copy numbers in the chromosome in Figure 2e 
from left to right are 1,0,1,2,4,3,1. So, copy number changes 
six times. 

There are also multiple ways to count the number of copy 
number states in a chromosome. The first we can call 'strict'. 
With this method, we simply count the number of copy
number states observed in the chromosome, regardless of
how much of the chromosome is covered by any copy num- 
ber state. In Figure 2f, there are five copy states observed, 
zero through four. Another method, which we will call 're- 
laxed', counts how many copy states are needed to cover 
some fraction of the chromosome. If we use the fraction 
90%, then the relaxed number of copy states in the chro- 
mosome above is four because we can cover 90 bases us- 
ing only four copy number states. Relaxed counting of copy 
states can be appropriate for identifying chromothripsis be- 
cause it allows us to find chromosomes that are dominated 
by two or three copy number states but may have some small 
regions with other copy numbers because of subsequent al- 
terations or experimental error. 
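
The three breakpoint counts and the two copy number state counts can be compared side by side on a small example. The sketch below is illustrative only: the chromosome is a hypothetical one chosen here (it is not the chromosome of Figure 2f), all segments are assumed to have equal length, and the function names are not taken from any published code.

```python
from collections import Counter

LABELS = list("abcdefghij")          # reference order; equal-length segments assumed
INDEX = {x: i for i, x in enumerate(LABELS)}

def junction(left, right):
    """Pair of segment ends joined when `left` is followed by `right`."""
    (x, sx), (y, sy) = left, right
    exit_end = (x, "t" if sx > 0 else "h")
    entry_end = (y, "h" if sy > 0 else "t")
    return tuple(sorted([exit_end, entry_end]))

def is_reference_adjacency(left, right):
    (x, sx), (y, sy) = left, right
    return sx == sy and INDEX[y] - INDEX[x] == sx

def abnormal_junctions(chrom):
    pairs = zip(chrom, chrom[1:])
    return [junction(l, r) for l, r in pairs if not is_reference_adjacency(l, r)]

def copy_numbers(chrom):
    counts = Counter(x for x, _ in chrom)
    return [counts[x] for x in LABELS]

def relaxed_states(cn, fraction):
    """Fewest copy number states whose segments cover `fraction` of the reference."""
    coverage = sorted(Counter(cn).values(), reverse=True)
    covered = 0
    for n_states, n_segments in enumerate(coverage, start=1):
        covered += n_segments
        if covered / len(cn) >= fraction:
            return n_states
    return len(coverage)

# Hypothetical chromosome: a e (-c) (-g) (-h) h g j
chrom = [("a", 1), ("e", 1), ("c", -1), ("g", -1),
         ("h", -1), ("h", 1), ("g", 1), ("j", 1)]

junctions = abnormal_junctions(chrom)
cn = copy_numbers(chrom)
every_adjacency = len(junctions)                        # first counting method
pes_breakpoints = len(set(junctions))                   # each junction reported once
cn_changes = sum(a != b for a, b in zip(cn, cn[1:]))    # microarray-style count
strict = len(set(cn))
print(every_adjacency, pes_breakpoints, cn_changes, strict,
      relaxed_states(cn, 0.8), relaxed_states(cn, 0.95))
```

In this example the adjacencies (-g)(-h) and hg describe the same junction, so the PES-style count is one lower than the count of every adjacency, mirroring the case described for Figure 2f.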
Table 2. Fraction of chromosomes in Figure 4a with few copy number states for given breakpoint counts

Breakpoint range    Fraction of chromosomes with two or three copy number states
50-59               12.6%
60-69               7.4%
70-79               2.6%
80-89               0.8%
90-99               0.6%



The simulation by Stephens et al. used strict copy number 
state counting and the first breakpoint counting method, 
counting every unexpected adjacency even if duplicated. In 
contrast, the breakpoints observed in chromosome 15 of 
SNU-C1 come from PES, and the copy number state count 
of two was arrived at using relaxed counting. Microarray re- 
sults show that the chromosome has six copy number states 
using strict counting (16). 

We modified the simulation to use relaxed copy state 
counting that found how many copy number states were 
needed to cover 95% of the simulated chromosome. When 
this was combined with PES breakpoint counting, it pro- 
duced the results in Figure 3c; when combined with mi- 
croarray breakpoint counting, it produced Figure 3d. Be- 
cause of the changes in breakpoint counting, the simula- 
tions could no longer quickly produce chromosomes with 
over 100 breakpoints. Both simulations also showed a con- 
tinuation of the trend seen in Figure 3b with a narrowing 
separation between the simulated chromosomes and chromosomes
bearing the footprint of chromothripsis. For example,
of the 414 chromosomes in Figure 3c with between
50 and 55 breakpoints, 16 (3.9%) were dominated by two
or three copy number states. This suggests that in a screen of
many chromosomes, the proposed footprint of chromoth- 
ripsis may produce false discoveries. 

Finally, we altered the way the simulation chooses breakpoints
to introduce into the chromosome. In the original
simulation, breakpoints were chosen uniformly at random
without replacement, so each remaining breakpoint had an
equal chance of being introduced at each step. This may not 
correspond to biological reality as there may be some pref- 
erence for particular kinds of rearrangements. Specifically, 
a preference for inversions over other rearrangement types 
could lead to chromosomes with many breakpoints but few 
copy number states. To test this, we changed the simula- 
tion so that inversions were twice as likely to be chosen at 
each step compared to deletions or duplications. The results 
are in Figure 4a and b, using PES and microarray break- 
point counting, respectively. These results have many simu- 
lated chromosomes bearing the footprint of chromothripsis. 
The large fraction of chromosomes with many breakpoints 
and few copy number states (Table 2) indicates that some 
chromosomes that appear to have undergone chromothrip- 
sis could also have been produced by progressive rearrange- 
ment that favors inversions. 
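
One way such a preference could be expressed is weighted sampling without replacement. The sketch below is an assumption about how the bias might be implemented, not a description of the actual simulation code: each breakpoint is tagged with the rearrangement type that would introduce it, and an inversion-type breakpoint carries twice the weight of any other. The function name and the tagging scheme are hypothetical.

```python
import random

def choose_order(breakpoints, inversion_weight=2.0, rng=None):
    """Order breakpoints by weighted sampling without replacement.

    `breakpoints` is a list of (name, kind) pairs, where kind is one of
    "inversion", "deletion" or "duplication". At each step an
    inversion-type breakpoint is `inversion_weight` times as likely to be
    drawn as any single breakpoint of another kind.
    """
    rng = rng or random.Random()
    remaining = list(breakpoints)
    order = []
    while remaining:
        weights = [inversion_weight if kind == "inversion" else 1.0
                   for _, kind in remaining]
        pick = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        order.append(remaining.pop(pick))
    return order

bps = [("x", "inversion"), ("y", "deletion"), ("z", "duplication")]
print(choose_order(bps, rng=random.Random(0)))
```

Setting `inversion_weight` to 1.0 recovers uniform sampling without replacement, which corresponds to the original scheme described above.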

The results in this section suggest that a more conser- 
vative threshold should be used to identify chromothripsis 
in order to avoid false discoveries. If the minimum num- 
ber of breakpoints were set at 100 rather than 50, much 
of the risk of false discovery we have demonstrated above
would be diminished. However, this threshold would also
decrease the estimate of the prevalence of chromothripsis. 
When Stephens et al. screened 746 cancer cell line copy 
number profiles for chromosomes with over 50 breakpoints 
and at most three copy number states, they found chro- 
mosomes from 18 cell lines that met these criteria. With a 
threshold of 100 breakpoints, the number of cell lines drops 
to 3. Based on this analysis, the true prevalence of chromothripsis
may be less than 0.5% rather than the original
estimate of 2-3%.
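
Both prevalence figures follow from the counts quoted above (18 and 3 cell lines out of 746 screened); the arithmetic is written out here for convenience:

```latex
\frac{18}{746} \approx 2.4\%, \qquad \frac{3}{746} \approx 0.4\% < 0.5\%
```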

Simulation method does not distinguish between progressive 
rearrangement and chromothripsis 

In the previous section, we discussed implementation de- 
tails of simulations of progressive rearrangements. We now 
turn our attention to the question of whether such simu- 
lations can provide reliable evidence for chromothripsis at 
all. In order for an experiment to provide information about 
a hypothesis, it has to produce different results when the 
hypothesis is true than when it is false. In order for simu- 
lations to demonstrate whether a chromosome could have 
been rearranged progressively, the simulations should pro- 
duce different results for progressively rearranged chromo- 
somes and chromosomes that have undergone chromoth- 
ripsis. 

The footprint of chromothripsis, many breakpoints with 
few unique copy states, is unlikely to appear in a chro- 
mosome rearranged by progressive and overlapping tan- 
dem duplications. However, it may appear in a chromo- 
some rearranged by progressive inversions and deletions. 
We simulated such a chromosome with only inversions and 
deletions. The resulting breakpoints and copy numbers are 
shown in Supplementary Figure S1. The chromosome had
237 breakpoints and only two copy number states, zero and 
one. Even though only two kinds of rearrangements were 
used, the chromosome shows the same complex rearrange- 
ment pattern seen in chromosomes that have putatively un- 
dergone chromothripsis. 
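
The reason such a chromosome can never exceed two copy number states is that neither inversions nor deletions create additional copies of a segment. The sketch below checks this on a randomly rearranged toy chromosome. It is not the simulation behind Supplementary Figure S1: the segment count, step count, deletion probability and span limit are arbitrary values assumed here, and the resulting breakpoint count will not match the 237 reported above.

```python
import random
from collections import Counter

def invert(chrom, i, j):
    return chrom[:i] + [(k, -s) for k, s in reversed(chrom[i:j])] + chrom[j:]

def delete(chrom, i, j):
    return chrom[:i] + chrom[j:]

def simulate(n_segments=400, n_steps=150, p_delete=0.1, max_span=10, seed=0):
    """Apply random inversions and deletions to an intact chromosome."""
    rng = random.Random(seed)
    chrom = [(k, +1) for k in range(n_segments)]
    for _ in range(n_steps):
        if len(chrom) < 3:
            break
        i = rng.randrange(0, len(chrom) - 1)
        j = rng.randrange(i + 1, min(len(chrom), i + max_span) + 1)
        if rng.random() < p_delete:
            chrom = delete(chrom, i, j)
        else:
            chrom = invert(chrom, i, j)
    return chrom

def copy_number_states(chrom, n_segments):
    counts = Counter(k for k, _ in chrom)
    return {counts[k] for k in range(n_segments)}

def count_breakpoints(chrom):
    """Number of adjacencies that differ from the reference order."""
    return sum(not (s1 == s2 and k2 - k1 == s1)
               for (k1, s1), (k2, s2) in zip(chrom, chrom[1:]))

chrom = simulate()
print(count_breakpoints(chrom), copy_number_states(chrom, 400))
```

Whatever the random seed, the set of copy number states stays within zero and one, while the number of abnormal adjacencies grows with the number of steps.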

We then applied the simulation method to the break- 
points of this chromosome and recorded the results as we 
did in Figure 3b. The resulting distribution of breakpoints
and copy number states in Figure 5 is not different from 
Figure 3b even though we know the chromosome was re- 
arranged progressively. This result casts doubt on the use- 
fulness of the simulation method to detect chromothrip- 
sis. Rather than distinguishing between chromosomes that 
shattered and chromosomes that were rearranged progres- 
sively, it always reports that chromosomes with many com- 
plex rearrangements and few copy number states are the 
product of chromothripsis even when they are not. 

Plausible progressive rearrangement schemes exist for chro- 
mosomes bearing footprint of chromothripsis 

Thus far, we have discussed in general whether some chro- 
mosomes that appear to be the product of chromothripsis 
may actually have been progressively rearranged. We now 
move from the general to the specific to see if we can find 
series of progressive rearrangements that explain particu- 
lar chromosomes that bear the footprint of chromothripsis. 

Figure 4. Charts of breakpoints versus copy number states for simulations with an overrepresentation of inversions. (a) Result using PES breakpoint counting. (b) Result using microarray breakpoint counting.

Figure 5. Counts of breakpoints and copy number states from a simulation based on the breakpoints from the simulated chromosome in Supplementary Figure S1. The breakpoints and copy number states of the simulated chromosome are indicated on the chart.

Stephens et al. singled out three chromosomes from three
different cell lines for extensive sequencing and analysis: 
chromosome 5 from TK10, chromosome 9 from 8505C and 
chromosome 15 from SNU-C1. These chromosomes had 
55, 77 and 239 breakpoints respectively and oscillated be- 
tween two copy number states. We developed a method that 
explains these breakpoints and copy number states using 
only progressive rearrangements. 

As discussed above, one way to ensure that a chromo- 
some has no more than two copy states is to only rear- 
range it by deletions and inversions. The problem of ex- 
plaining genomic rearrangements using inversions alone, 
also known as the 'Sorting by reversals' problem, was solved 
by the Hannenhalli-Pevzner theory (17), and implemented 
in the tool GRIMM (12). In this problem, the input is a pair 
of chromosomes with identical, but highly rearranged ge- 
nomic content. The output is a sequence of inversions that 
transforms one chromosome to the other. For each of the 
three chromothripsis chromosomes, we identified arrange- 
ments of chromosomal segments that would yield the ob- 
served breakpoints using a graph traversal technique. In 
addition, some of the breakpoints support missing chro- 
mosomal segments. These missing segments were removed 
by progressively introducing deletions. Also, breakpoints 
supporting tandem duplications are resolved last to ensure 
the fewest copy number states are observed. GRIMM then 
revealed inversions that would convert the unrearranged 
chromosome into one with the observed breakpoints (see 
Materials and Methods). 

For each of the three chromosomes, we identified a se- 
quence of inversions, deletions and tandem duplications 
that yielded 100% of the experimentally observed break- 
points as well as some additional breakpoints beyond what 
was observed (Table 3). Figure 6 illustrates the result for 
chromosome 5 from TK10. Animations of the series of re- 
arrangements for each of the three chromosomes are in the 
Supplementary data. 

These series of progressive rearrangements raise poten- 
tial alternative hypotheses for the complex breakpoints and 
oscillating copy number states in these chromosomes. Thus, 
while these chromosomes may have indeed undergone chro- 
mothripsis, the observations can also be explained using 
progressive rearrangements alone. 

The alternative explanation 

Above, we have shown that simulations and the observed 
pattern of low copy number state count and high num- 
ber of breakpoints clearly cannot distinguish chromothrip- 
sis from progressive rearrangements that favor inversions 
and deletions. However, we do not claim that a particu- 
lar scheme of simple inversions and simple deletions causes 
the observed phenomenon. Inversions are the simplest gen- 
eralization of a larger class of balanced rearrangements, 
which include translocations and rearrangements with multiple
breakpoints. Specifically, define a k-break rearrangement
as an operation that rearranges k - 1 distinct segments
of a chromosome, creating k breakpoints. By this
definition, an inversion is a specific type of 2-break rearrangement,
while transpositions (and inverted transpositions)
are examples of 3-break rearrangements. Consistent
with genome rearrangement theory (11,18,19), a k-break rearrangement
can be equivalently explained by a series of 2-break
rearrangements (reviewed in Supplementary Section
3). Using the k-break definition, chromothripsis can be described
as an n-break rearrangement allowing for deletions.
Our results show that a signature consisting only of breakpoints
cannot distinguish progressive 2-break rearrangements
and deletions from a one-off n-break rearrangement
with deletions (chromothripsis). Since 2-break
rearrangements decompose k-break rearrangements, progressive
combinations of k-break rearrangements for 2 <
k < n are equally plausible explanations for the observed
breakpoints and copy number states. As each of these alter- 
native scenarios provides a different number of rearrange- 
ment events, the number of rearrangement events cannot be 
accurately estimated using only breakpoint and copy num- 
ber data. 
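
A small worked case of this decomposition is a transposition, which is a 3-break rearrangement, reproduced by three inversions, each a 2-break rearrangement. The sketch below uses a hypothetical list-of-signed-segments representation and is only one possible decomposition; the treatment in Supplementary Section 3 may proceed differently.

```python
def invert(chrom, i, j):
    """Reverse the order and flip the orientation of chrom[i:j]."""
    return chrom[:i] + [(x, -s) for x, s in reversed(chrom[i:j])] + chrom[j:]

def transpose(chrom, i, j, k):
    """Swap adjacent blocks chrom[i:j] and chrom[j:k] (a 3-break rearrangement)."""
    return chrom[:i] + chrom[j:k] + chrom[i:j] + chrom[k:]

def transpose_by_inversions(chrom, i, j, k):
    """Same end result using three inversions (2-break rearrangements)."""
    chrom = invert(chrom, i, k)              # A B   ->  -B -A
    chrom = invert(chrom, i, i + (k - j))    # -B    ->   B
    chrom = invert(chrom, i + (k - j), k)    # -A    ->   A
    return chrom

chrom = [(x, +1) for x in "abcdef"]
direct = transpose(chrom, 1, 3, 5)
stepwise = transpose_by_inversions(chrom, 1, 3, 5)
print(direct == stepwise, [x for x, _ in stepwise])
```

The final chromosomes are identical, so breakpoint and copy number data alone cannot tell whether one event or three took place.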

DISCUSSION 

It is notoriously difficult to make sense of many cancer 
genomes due to the complexity of rearrangements. The pro- 
posal of the chromothripsis hypothesis was an important 
step forward as a possible mechanism for the creation of 
this complexity. Careful investigation of the phenomenon 
may deepen knowledge of structural variation in cancers. 

At the same time, the proposal of 'shattering and subse- 
quent reassembly' of a chromosome in a single catastrophic 
event is truly extraordinary. The invocation of chromoth- 
ripsis to explain molecular data from cancer samples must 
be done with great circumspection and caution. The
case for chromothripsis rests on the argument that there are 
some patterns of variation that progressive rearrangement 
cannot achieve. But in this paper, we have shown that pro- 
gressive rearrangements can indeed achieve patterns that, 
at first glance, would seem quite unlikely. The primary
evidence supporting chromothripsis (1) is criterion (1): a high
breakpoint count combined with a small number of copy number
states. We demonstrated that
this footprint of chromothripsis, in fact, includes chromo- 
somes rearranged progressively, that simulations might al- 
ways rule out progressive rearrangement regardless of how 
the chromosome truly evolved and that it is possible to find 
progressive rearrangements that explain chromosomes that 
appear to be exemplars of chromothripsis. 

Figure 6. An illustration of the result of the series of inversions and deletions for chromosome 5 of TK10. The top panel broadly shows the ordering of
segments after rearrangement. The upper color bar shows all segments of the unrearranged chromosome colored from blue to red. The lower color bar
shows segments with the same coloring after rearrangement. Note that some segments have been deleted so the chromosome is shorter. The middle panel
shows the breakpoints achieved by inversions and deletions, and the lower panel shows the observed breakpoints.

Table 3. For each of the three chromothripsis chromosomes, the number of breakpoints observed experimentally, the number of unobserved breakpoints
that were produced by the inversions and deletions and the number of progressive rearrangement events broken down by rearrangement type

Additional criteria used to argue for a chromothripsis
event are: (2) clustering of breakpoint locations, (3) randomness
of fragment joins, (4) rearrangements affecting a single
haplotype, (5) interspersed loss and retention of heterozygosity
and (6) ability to walk the derivative chromosome (1,10).
However, these additional criteria do not preclude chromosome
formation via progressive rearrangements. For example,
progressive rearrangements may produce the same pattern of
(2) clustered breakpoint locations and (3) randomness of
fragment joins. In our progressive rearrangement simulations,
breakpoints were sampled from the exemplar chromosome 15 of
SNU-C1, so the simulated chromosomes have an identical
distribution of breakpoint locations and breakpoint
orientations. A recent review (10) reported patterns (4), (5)
and (6), but did not provide quantitative analysis of these
patterns against the few chromosomes proposed to have
undergone chromothripsis. Pattern (4) suggests that
breakpoints falling on one chromosome versus two homologous
chromosomes is an indication of chromothripsis. Again, the
pattern does not discriminate, since rearrangements
may still progressively fall on a single chromosome. Ad- 
ditionally, patterns (5) and (6) are features specifically of 
rearrangements appearing on the same chromosome. (5) 
Interspersed loss and retention of heterozygosity occurs 
when segments are deleted from only one of the homolo- 
gous chromosomes. As the other homologous chromosome 
is intact, segments that are deleted appear to have loss of 
heterozygosity and remaining segments retain heterozygos- 
ity. The property of (6) walking the derivative chromosome 
states that a set of chromothripsis breakpoints when pro- 
jected onto a reference chromosome will allow for an un- 
ambiguous walk from one end of the reference chromo- 
some to the other end of the chromosome traversing all the 
observed breakpoints. If both homologous chromosomes
have rearrangements, then, because it is not known which
breakpoints arise from which chromosome, finding an unambiguous
walk is typically not possible. Along the reference there
would be two walks and for each walk, inference would have 
to be made on which breakpoints to follow. However, if only 
one of the homologous chromosomes is rearranged, a sin- 
gle unambiguous walk is possible using all the breakpoints 
supporting the single rearranged chromosome. In our analysis,
like Stephens et al., we assume progressive rearrangements
occur on the same chromosome. Consequently, the proposed
patterns (4), (5) and (6) are inherently reproduced by the
progressive rearrangements in our simulations and in our
plausible explanations of documented chromothripsis
chromosomes.
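
The idea of walking the derivative chromosome, criterion (6), can be sketched in a few lines. The Python below is an illustrative simplification rather than the procedure used in (1) or (10): it assumes a single rearranged homolog represented as signed segment labels, records every join between consecutive segments (not only the novel breakpoints), and uses hypothetical names (adjacencies, walk) and a made-up example chromosome. Each segment end is written as (segment, 't') for its tail or (segment, 'h') for its head.

```python
def adjacencies(derivative):
    """Map each joined segment end to the end it is joined with."""
    def left(s):   # end met first when reading the segment left to right
        return (abs(s), 't' if s > 0 else 'h')

    def right(s):  # end met last when reading the segment left to right
        return (abs(s), 'h' if s > 0 else 't')

    adj = {}
    for x, y in zip(derivative, derivative[1:]):
        adj[right(x)] = left(y)
        adj[left(y)] = right(x)
    return adj


def walk(adj, start):
    """Follow joins from a telomeric end until no further join exists."""
    path, seen, end = [], set(), start
    while True:
        seg, side = end
        if seg in seen:
            raise ValueError("segment revisited: walk is not a simple path")
        seen.add(seg)
        # Entering at the tail reads the segment forward, at the head inverted.
        path.append(seg if side == 't' else -seg)
        other = (seg, 'h' if side == 't' else 't')
        if other not in adj:
            return path
        end = adj[other]


# Hypothetical derivative: segment 3 inverted and moved, segment 4 deleted.
derivative = [1, -3, 2, 5]
recovered = walk(adjacencies(derivative), (1, 't'))
print(recovered)
```

Because every segment end has at most one partner when only one homolog is rearranged, the walk is unambiguous. It is so whether the joins arose in one event or over many, which is why this criterion does not separate the two scenarios.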

On balance, our results suggest it is difficult to point to
statistical evidence that predicts chromothripsis while
excluding other possibilities. In this manuscript, we do not
delve into biological explanations of the observed
rearrangement patterns. For example, if rearrangements
accumulate progressively, how could they be limited to a
single chromosome? Is not a one-off catastrophic
chromothripsis event a better explanation than some 'memory' that
causes the same chromosome (or a few chromosomes) to 
be dramatically rearranged over time? However, there is ev- 
idence that chromosomal lesions might make a chromo- 
some more susceptible to mutations. An example is pro- 
vided by the breakage-fusion-bridge mechanism where the 
loss of a telomere and resulting instability leads to
progressive cycles of rearrangements. Indeed, Sorzano et al.
propose that breakage-fusion-bridge and similar progressive
mechanisms show patterns similar to chromothripsis (20).
Chiang et al. (21) showed that transgene integration in germline
cells makes a chromosome more susceptible to rearrangements
and that the resulting complex rearrangements have
chromothripsis-like patterns. While they suggest the
rearrangements appear in a one-off event, we note that there are
still numerous cell divisions between pronucleus injection 
of exogenous DNA and harvesting of DNA for sequencing. 
Based on this, we cannot rule out that rearrangements
accrued over a few cell divisions. Nor do the rearrangements
need to keep occurring in every cell division. For example,
breakage-fusion-bridge cycles occur across distinct
cell divisions, but eventually lead to a stable rearranged 
chromosome. Similarly, Liu et al. ((22), Table 1) provide
numerous examples of complex rearrangements occurring
on a single chromosome. They suggest that the results are
best explained not by chromothripsis but by
'chromoanasynthesis', as they involve errors in replicative
mechanisms and show duplications and triplications not seen
in chromothripsis. They provide the example of a patient
with 18 copy number changes including a 5.5 Mbp tripli- 
cated and inverted segment. Breakpoint analysis revealed 
insertions of long (1.5 Kbp) novel sequences at breakpoints,
which might provide a template switch during the replicative 
process (22). Once we admit these possibilities, however, the 
difference between one-off and progressive events becomes 
harder to measure. 

We find purely statistical evidence cannot distinguish 
between one-off and progressive events (see also (23,20)). 
Other authors have noted that 'chromothripsis like events' 
cannot be distinguished from other complex genomic re- 
arrangements (24). While we use inversions and deletions 
to explain the chromothripsis patterns, we do not claim 
a particular scheme of simple inversions and simple dele- 
tions causes the phenomenon. Inversions are the simplest 
generalization of a larger class of balanced rearrange- 
ments, which include translocations and rearrangements 
with multiple breakpoints (see Supplementary Section 3). 
In other words, the chromothripsis pattern could also be 
explained by progressive steps of balanced rearrangements 
like translocations. Lastly, the appearance of balanced rear- 
rangements and deletions is a plausible scenario for cancer. 
Wang et al. (25) analyzed five T-ALL samples and observed 
31 interchromosomal translocations, 19 intrachromosomal 
translocations, 1 inversion, 22 deletions and 16 insertions.
While they did not describe copy number changes, the number
of breakpoints from possible copy neutral events is high
(31 + 19 + 1 × 2 = 52 breakpoints). Thus, it is reasonable
to find a few extreme cancer chromosomes having a higher 
number of balanced rearrangements and deletions. 

CONCLUSION 

These results do not foreclose upon the chromothripsis
hypothesis, of course. But they do underscore the difficulty of
making inferences about mechanisms in cancer. Indeed,
there is no doubt that some cancer genomes have
undergone extensive rearrangements. At the same time, the 
evidence is limited for the claim that a single catastrophic 
event joined shattered DNA together, and requires addi- 
tional investigation before it can be accepted as established 
fact. For now, chromosomes with many breakpoints should 
be labeled as having undergone complex genome rearrange- 
ments rather than implying a shattering mechanism by 
chromothripsis. Future advances in single-cell sequencing 
and haplotype resolved genome assembly might shed light 
on the mechanisms underlying complex rearrangements. 